# Multilingual TTS

The Kyoto Station Text-to-Speech (TTS) model is a streaming text-to-speech model that supports real-time speech generation and multiple languages.

Speech Synthesis Supports Multiple Languages

Spark TTS 0.5B GGUF

A quantized version of prince-canuma/Spark-TTS-0.5B, supporting text-to-speech in English and Chinese.

Speech Synthesis Supports Multiple Languages

Llama OuteTTS 1.0 1B 3bit

This is a 3-bit quantized text-to-speech model in MLX format, supporting multiple languages.

Speech Synthesis Supports Multiple Languages
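
Several entries in this list are quantized builds (3-bit MLX, GGUF). As a rough illustration of why that matters, the sketch below estimates the storage needed for the weights alone at different bit widths. It assumes "1B" means exactly one billion parameters and ignores quantization scales, metadata, and any layers kept at higher precision, so real files will be somewhat larger. The function name `quantized_weight_bytes` is hypothetical and not part of any library.

```python
def quantized_weight_bytes(num_params: int, bits_per_weight: float) -> int:
    """Approximate bytes needed to store the weights alone (no scales or metadata)."""
    return int(num_params * bits_per_weight / 8)


# Assumption: "1B" is taken as exactly one billion parameters.
params = 1_000_000_000

for bits in (16, 8, 4, 3):
    size_gb = quantized_weight_bytes(params, bits) / 1e9
    print(f"{bits:>2}-bit weights: ~{size_gb:.3f} GB")
```

Under these assumptions, a 3-bit build of a 1B-parameter model needs about 0.375 GB for its weights, compared with about 2 GB at 16 bits.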

IndicF5 is a near-human-level multilingual text-to-speech (TTS) model supporting 11 Indian languages, trained on 1417 hours of high-quality speech data.

Speech Synthesis Other

This is an English text-to-speech model released under the Apache-2.0 license.

Speech Synthesis English

Fish Speech 1.5

Fish Speech V1.5 is a leading text-to-speech (TTS) model, trained on over 1 million hours of multilingual audio data.

Speech Synthesis Supports Multiple Languages

Zonos V0.1 Transformer

Zonos-v0.1 is a leading open-weight text-to-speech model trained on over 200,000 hours of multilingual speech data, with expressiveness and quality comparable to, or even surpassing, those of top-tier TTS service providers.

Speech Synthesis

Outetts 0.3 1B GGUF

OuteTTS-0.3-1B is a multilingual text-to-speech model developed by OuteAI, supporting English, Chinese, Japanese, Korean, French, and German.

Speech Synthesis Supports Multiple Languages

Outetts 0.3 1B GGUF

OuteTTS-0.3-1B is a multilingual text-to-speech model developed by OuteAI and quantized by Second State Inc.

Speech Synthesis Supports Multiple Languages

Outetts 0.3 500M GGUF

OuteTTS-0.3-500M is a multilingual text-to-speech model developed by OuteAI and released under the cc-by-nc-4.0 license.

Speech Synthesis Supports Multiple Languages

Outetts 0.3 500M GGUF

OuteTTS-0.3-500M is a multilingual text-to-speech model supporting English, Chinese, Japanese, Korean, French, and German.

Speech Synthesis Supports Multiple Languages

Outetts 0.2 500M GGUF

OuteTTS-0.2-500M is a multilingual text-to-speech model developed by OuteAI, supporting English, Chinese, Japanese, and Korean.

Speech Synthesis Supports Multiple Languages

Outetts 0.2 500M GGUF

OuteTTS-0.2-500M is a multilingual text-to-speech model supporting English, Chinese, Japanese, and Korean.

Speech Synthesis Supports Multiple Languages

Helpingai TTS V1

HelpingAI-TTS-v1 is a next-generation text-to-speech (TTS) tool focused on personalization, emotional expression, and clarity, supporting multiple languages and emotion customization.

Speech Synthesis

Transformers Supports Multiple Languages

Fish Speech 1.5

Leading text-to-speech (TTS) model trained on over 1 million hours of multilingual audio data

Speech Synthesis Supports Multiple Languages

Fish Speech 1.5 Base

Fish Speech 1.5 is a multilingual text-to-speech model that supports multiple languages and can be used without an access token.

Speech Synthesis Supports Multiple Languages

Indri 0.1 124m Tts GGUF

Indri is a 124M-parameter text-to-speech (TTS) model supporting English and Hindi, distributed in GGUF format and optimized for CPU inference.

Speech Synthesis Supports Multiple Languages

Speecht5 Tts Tamil

A Tamil speech synthesis model fine-tuned from microsoft/speecht5_tts on the common_voice_17_0 dataset

Speech Synthesis

Fish Agent V0.1 3b

A groundbreaking speech-to-speech model capable of accurately capturing and generating environmental audio information, while featuring advanced text-to-speech capabilities.

Speech Synthesis Supports Multiple Languages

Speect5 Common Voice Hindi

A Hindi speech synthesis model fine-tuned from microsoft/speecht5_tts on the common_voice_17_0 dataset

Speech Synthesis

Transformers Other

XTTS Hindi Finetuned

This is a version of the XTTS v2 model developed by Coqui-AI, fine-tuned on Hindi speech data, supporting voice cloning and multilingual speech generation.

Speech Synthesis

Speecht5 Tts Vie

This is a Vietnamese text-to-speech (TTS) model fine-tuned from microsoft/speecht5_tts on the generator dataset.

Speech Synthesis

Transformers Other

viⓍTTS is a voice generation model supporting 18 languages, specifically optimized for Vietnamese, achieving cross-lingual voice cloning with just 6 seconds of audio.

Speech Synthesis

Transformers Other

Speecht5 Finetuned Voxpopuli Ro

A text-to-speech model fine-tuned from microsoft/speecht5_tts on the VoxPopuli dataset

Speech Synthesis

Speecht5 Tts Portuguese

A Portuguese text-to-speech model fine-tuned from Microsoft's SpeechT5, supporting high-quality speech synthesis

Speech Synthesis

Transformers Other

flavioegoncalves

Speecht5 Finetuned Multilingual Librispeech De

A text-to-speech model fine-tuned from Microsoft's SpeechT5 model on the German LibriSpeech dataset

Speech Synthesis

Transformers German

Burmese text-to-speech model developed by Meta, part of the Massively Multilingual Speech (MMS) project

Speech Synthesis

Mms Tts Cmo Script Khmer

A Central Mnong text-to-speech model developed by Meta, supporting conversion of text to natural speech

Speech Synthesis

A Kyrgyz text-to-speech model developed by Meta, based on the VITS architecture, supporting high-quality speech synthesis.

Speech Synthesis

Chichewa text-to-speech model developed by Meta AI, based on VITS architecture, supporting high-quality speech synthesis

Speech Synthesis

Vietnamese text-to-speech model developed by Meta, based on the VITS architecture, supporting high-quality speech synthesis

Speech Synthesis

Malayalam text-to-speech model in Facebook's MMS project, implementing end-to-end speech synthesis based on VITS architecture

Speech Synthesis

Bark is a Transformer-based multilingual text-to-audio model developed by Suno, capable of generating realistic speech, music, and non-verbal sounds

Speech Synthesis

Transformers Supports Multiple Languages

Speecht5 Finetuned Google Fleurs Greek

Greek text-to-speech model fine-tuned from microsoft/speecht5_tts

Speech Synthesis

Speecht5 Finetuned Common Voice 13 0 Euskera

A text-to-speech model fine-tuned from Microsoft's SpeechT5 on the Common Voice 13.0 Basque dataset

Speech Synthesis

Speecht5 Tts Finetuned Voxpopuli Sk V2

A text-to-speech model fine-tuned from Microsoft's SpeechT5 on the Slovak VoxPopuli dataset

Speech Synthesis

Korean text-to-speech model from Meta's Massively Multilingual Speech project, supporting natural speech conversion from Korean text

Speech Synthesis

Speecht5 Tts Common Voice Zh

Chinese text-to-speech model fine-tuned from microsoft/speecht5_tts

Speech Synthesis

Transformers Chinese

Speecht5 Tts Commonvoice Ca

Catalan text-to-speech model based on the SpeechT5 architecture, fine-tuned on the Common Voice 11.0 dataset

Speech Synthesis

Transformers Other

© 2025 AIbase